In this part of the workshop we will talk about raster data. We will
start with an introduction of the fundamental principles, packages, and
metadata needed to work with raster data in R. We will discuss some of
the most important metadata elements, including CRS and resolution. We
continue to work with the tidyverse and here
packages and we will use two additional packages to work with raster
data: raster and rgdal.
The dataset we chose is a set of GeoTIFF files on Delft or subdivisions of it. The files were included in the archive you downloaded before the workshop.
The GeoTIFF format contains a set of embedded tags with metadata
about the raster data. We can use the function GDALinfo() from the
rgdal package to get information about our raster data
before we read that data into R. It is recommended to do this before
importing your data.
GDALinfo(here("data","tud-dsm-5m.tif"))
## rows 386
## columns 722
## bands 1
## lower left origin.x 83565
## lower left origin.y 445250
## res.x 5
## res.y 5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2023-06/data/tud-dsm-5m.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 FALSE 0 2 722
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -5.39069 92.08102 2.588572 5.45896
## Metadata:
## AREA_OR_POINT=Area
Now that we’ve previewed the metadata for our GeoTIFF, let’s import
this raster dataset. We are going to work with a Digital Surface Model
(DSM) which is in the GeoTIFF format. We can use the
raster() function to import a raster file.
DSM_TUD <- raster(here("data","tud-dsm-5m.tif"))
DSM_TUD
## class : RasterLayer
## dimensions : 386, 722, 278692 (nrow, ncol, ncell)
## resolution : 5, 5 (x, y)
## extent : 83565, 87175, 445250, 447180 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tud-dsm-5m.tif
## names : tud.dsm.5m
## values : -5.39069, 92.08102 (min, max)
Similar to other data structures in R like vectors and data frame
columns, descriptive statistics for raster data can be retrieved with
the summary() function.
summary(DSM_TUD)
## tud.dsm.5m
## Min. -5.2235179
## 1st Qu. -0.6958068
## Median 0.5665370
## 3rd Qu. 4.4967444
## Max. 89.7837830
## NA's 0.0000000
But note the warning. Unless you force R to calculate these
statistics using every cell, it will take a random sample of 100,000
cells and calculate from them instead. Now, raster objects are not data
frames so you cannot count the cells with nrow(), but with
the ncell() function of the raster
package.
summary(DSM_TUD, maxsamp = ncell(DSM_TUD))
## tud.dsm.5m
## Min. -5.3906898
## 1st Qu. -0.7008305
## Median 0.5572625
## 3rd Qu. 4.4648066
## Max. 92.0810165
## NA's 0.0000000
To visualise raster data in ggplot2, we will need to
convert it to a data frame. The raster package has a
as.data.frame function for that purpose.
DSM_TUD_df <- as.data.frame(DSM_TUD, xy = TRUE)
Now when we view the structure of our data, we will see a standard data frame format.
str(DSM_TUD_df)
## 'data.frame': 278692 obs. of 3 variables:
## $ x : num 83568 83572 83578 83582 83588 ...
## $ y : num 447178 447178 447178 447178 447178 ...
## $ tud.dsm.5m: num 10.34 8.64 1.25 1.12 2.13 ...
We can use ggplot() to plot this data with another
geom_ function called geom_raster(). The
coord_quickmap() gives a quick approximation that preserves
straight lines. This approximation is suitable for small areas that are
not too close to the poles.
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = tud.dsm.5m)) +
scale_fill_viridis_c() + # remember, this color palette was introduced in the first lesson
coord_quickmap()
This map, a so-called Digital Surface Model, shows the elevation of our study site, including buildings and vegetation.
For faster previews, you can use the plot()
function.
plot(DSM_TUD)
But what units are these? This information is specified in the CRS, so let’s have a closer look at the CRS of our raster.
With raster objects we use the crs() function from the
raster package.
crs(DSM_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["unknown",
## BASEGEOGCRS["unknown",
## DATUM["Unknown based on Bessel 1841 ellipsoid",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]]],
## CONVERSION["unknown",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
We see that the units are in meters in the Proj4 string:
+units=m.
It is useful to know the minimum and maximum values of a raster dataset. In the case of a DSM, those values represent the min/max elevation range of our site.
minValue(DSM_TUD)
## [1] -5.39069
maxValue(DSM_TUD)
## [1] 92.08102
If Min and Max values haven’t been calculated, you can set them with
the raster::setMinMax() function.
DSM_TUD <- raster::setMinMax(DSM_TUD)
To see how many bands a raster dataset has, use the
raster::nlayers() function.
nlayers(DSM_TUD)
## [1] 1
This dataset has only 1 band. We will discuss multi-band raster data in a later episode.
A histogram can be used to inspect the distribution of raster values
visually. It can show if there are values above the max or below the min
of the expected range. We can create a histogram with the ggplot2
function geom_histogram().
ggplot() +
geom_histogram(data = DSM_TUD_df, aes(tud.dsm.5m))
Adjust the level of desired detail by setting the number of bins.
ggplot() +
geom_histogram(data = DSM_TUD_df, aes(tud.dsm.5m), bins = 40)
Looking at the distribution can help us identify bad data values, that is, values that are out of the min/max range.
Use GDALinfo() to determine the following about the
tud-dsm-hill.tif file:
DSM_TUD?GDALinfo(here("data","tud-dsm-5m-hill.tif"))
## rows 386
## columns 722
## bands 1
## lower left origin.x 83565
## lower left origin.y 445250
## res.x 5
## res.y 5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2023-06/data/tud-dsm-5m-hill.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Byte TRUE 0 11 722
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 1 255 167.1512 49.22522
## Metadata:
## AREA_OR_POINT=Area
In this part we will plot our raster object using
ggplot2 with customized coloring schemes. In the previous
plot, our DSM was colored with a continuous colour range. For clarity
and visibility, we may prefer to view the data “symbolized” or colored
according to ranges of values. This is comparable to a “classified” map.
For that, we need to tell ggplot how many groups to break
our data into and where those breaks should be. To make these decisions,
it is useful to first explore the distribution of the data using a bar
plot. To begin with, we will use dplyr’s mutate() function
combined with cut() to split the data into 3 bins.
DSM_TUD_df <- DSM_TUD_df %>%
mutate(fct_elevation = cut(tud.dsm.5m, breaks = 3))
ggplot() +
geom_bar(data = DSM_TUD_df, aes(fct_elevation))
To see the cutoff values:
unique(DSM_TUD_df$fct_elevation)
## [1] (-5.49,27.1] (27.1,59.6] (59.6,92.2]
## Levels: (-5.49,27.1] (27.1,59.6] (59.6,92.2]
To show count number of pixels in each group:
DSM_TUD_df %>%
group_by(fct_elevation) %>%
count()
## # A tibble: 3 × 2
## # Groups: fct_elevation [3]
## fct_elevation n
## <fct> <int>
## 1 (-5.49,27.1] 277100
## 2 (27.1,59.6] 1469
## 3 (59.6,92.2] 123
To customize cutoff values:
custom_bins <- c(-10, 0, 5, 100)
head(DSM_TUD_df)
## x y tud.dsm.5m fct_elevation
## 1 83567.5 447177.5 10.343450 (-5.49,27.1]
## 2 83572.5 447177.5 8.637954 (-5.49,27.1]
## 3 83577.5 447177.5 1.250230 (-5.49,27.1]
## 4 83582.5 447177.5 1.124690 (-5.49,27.1]
## 5 83587.5 447177.5 2.130767 (-5.49,27.1]
## 6 83592.5 447177.5 3.377968 (-5.49,27.1]
DSM_TUD_df <- DSM_TUD_df %>%
mutate(fct_elevation_cb = cut(tud.dsm.5m, breaks = custom_bins))
head(DSM_TUD_df)
## x y tud.dsm.5m fct_elevation fct_elevation_cb
## 1 83567.5 447177.5 10.343450 (-5.49,27.1] (5,100]
## 2 83572.5 447177.5 8.637954 (-5.49,27.1] (5,100]
## 3 83577.5 447177.5 1.250230 (-5.49,27.1] (0,5]
## 4 83582.5 447177.5 1.124690 (-5.49,27.1] (0,5]
## 5 83587.5 447177.5 2.130767 (-5.49,27.1] (0,5]
## 6 83592.5 447177.5 3.377968 (-5.49,27.1] (0,5]
unique(DSM_TUD_df$fct_elevation_cb)
## [1] (5,100] (0,5] (-10,0]
## Levels: (-10,0] (0,5] (5,100]
ggplot() +
geom_bar(data = DSM_TUD_df, aes(fct_elevation_cb))
DSM_TUD_df %>%
group_by(fct_elevation_cb) %>%
count()
## # A tibble: 3 × 2
## # Groups: fct_elevation_cb [3]
## fct_elevation_cb n
## <fct> <int>
## 1 (-10,0] 113877
## 2 (0,5] 101446
## 3 (5,100] 63369
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) +
coord_quickmap()
The plot above uses the default colors inside ggplot for raster
objects. We can specify our own colors to make the plot look a little
nicer. R has a built in set of colors for plotting terrain, which are
built in to the terrain.colors() function. Since we have
three bins, we want to create a 3-color palette:
terrain.colors(3)
## [1] "#00A600" "#ECB176" "#F2F2F2"
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y,
fill = fct_elevation_cb)) +
scale_fill_manual(values = terrain.colors(3)) +
coord_quickmap()
my_col <- terrain.colors(3)
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y,
fill = fct_elevation_cb)) +
scale_fill_manual(values = my_col, name = "Elevation") +
coord_quickmap()
Create a plot of the TU Delft Digital Surface Model
(DSM_TUD) that has:
DSM_TUD_df <- DSM_TUD_df %>%
mutate(fct_elevation_6 = cut(tud.dsm.5m, breaks = 6))
unique(DSM_TUD_df$fct_elevation_6)
## [1] (-5.49,10.9] (10.9,27.1] (27.1,43.3] (43.3,59.6] (59.6,75.8]
## [6] (75.8,92.2]
## 6 Levels: (-5.49,10.9] (10.9,27.1] (27.1,43.3] (43.3,59.6] ... (75.8,92.2]
my_col <- terrain.colors(6)
ggplot() +
geom_raster(data = DSM_TUD_df, aes(x = x, y = y,
fill = fct_elevation_6)) +
scale_fill_manual(values = my_col, name = "Elevation") +
coord_quickmap() +
xlab("X") +
ylab("Y") +
labs(title = "Elevation Classes of the Digital Surface Model (DSM)")
We can layer a raster on top of a hillshade raster for the same area, and use a transparency factor to create a 3-dimensional shaded effect. A hillshade is a raster that maps the shadows and texture that you would see from above when viewing terrain. We will add a custom color, making the plot grey.
DSM_hill_TUD <- raster(here("data","tud-dsm-5m-hill.tif"))
DSM_hill_TUD
## class : RasterLayer
## dimensions : 386, 722, 278692 (nrow, ncol, ncell)
## resolution : 5, 5 (x, y)
## extent : 83565, 87175, 445250, 447180 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tud-dsm-5m-hill.tif
## names : tud.dsm.5m.hill
## values : 1, 255 (min, max)
DSM_hill_TUD_df <- as.data.frame(DSM_hill_TUD, xy = TRUE)
str(DSM_hill_TUD_df)
## 'data.frame': 278692 obs. of 3 variables:
## $ x : num 83568 83572 83578 83582 83588 ...
## $ y : num 447178 447178 447178 447178 447178 ...
## $ tud.dsm.5m.hill: num NA NA NA NA NA NA NA NA NA NA ...
ggplot() +
geom_raster(data = DSM_hill_TUD_df,
aes(x = x, y = y, alpha = tud.dsm.5m.hill)) +
scale_alpha(range = c(0.15, 0.65), guide = "none") +
coord_quickmap()
We can layer another raster on top of our hillshade by adding another
call to the geom_raster() function. Let’s overlay DSM_TUD
on top of the DSM_hill_TUD.
ggplot() +
geom_raster(data = DSM_TUD_df ,
aes(x = x, y = y,
fill = tud.dsm.5m)) +
geom_raster(data = DSM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dsm.5m.hill)) +
scale_fill_viridis_c() +
scale_alpha(range = c(0.15, 0.65), guide = "none") +
ggtitle("Elevation with hillshade") +
coord_quickmap()
Use the tud-dtm.tif and tud-dtm-hill.tif
files from the data directory to create a Digital Terrain
Model map of the TU Delft area.
Make sure to:
# import DTM
DTM_TUD <- raster(here("data","tud-dtm-5m.tif"))
DTM_TUD_df <- as.data.frame(DTM_TUD, xy = TRUE)
# DTM Hillshade
DTM_hill_TUD <- raster(here("data","tud-dtm-5m-hill.tif"))
DTM_hill_TUD_df <- as.data.frame(DTM_hill_TUD, xy = TRUE)
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y,
fill = tud.dtm.5m,
alpha = 2.0)
) +
geom_raster(data = DTM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dtm.5m.hill)
) +
scale_fill_viridis_c() +
guides(fill = guide_colorbar()) +
scale_alpha(range = c(0.4, 0.7), guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
theme(axis.title.x = element_blank(),
axis.title.y = element_blank()) +
ggtitle("DTM with Hillshade") +
coord_quickmap()
What happens when maps don’t line up? That is usually a sign that layers are in different coordinate reference systems (CRS).
In this episode, we will be working with the digital terrain model.
DTM_TUD <- raster(here("data","tud-dtm-5m.tif"))
DTM_hill_TUD <- raster(here("data","tud-dtm-5m-hill-ETRS89.tif"))
DTM_TUD_df <- as.data.frame(DTM_TUD, xy = TRUE)
DTM_hill_TUD_df <- as.data.frame(DTM_hill_TUD, xy = TRUE)
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y,
fill = tud.dtm.5m)) +
geom_raster(data = DTM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dtm.5m.hill.ETRS89)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
Our results are curious - neither the Digital Terrain Model
(DTM_TUD_df) nor the DTM Hillshade
(DTM_hill_TUD_df) plotted. Let’s try to plot the DTM on its
own to make sure there are data there.
ggplot() +
geom_raster(data = DTM_TUD_df,
aes(x = x, y = y,
fill = tud.dtm.5m)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
ggplot() +
geom_raster(data = DTM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dtm.5m.hill.ETRS89)) +
coord_quickmap()
If we look at the axes, we can see that the projections of the two rasters are different.
View the CRS for each of these two datasets. What projection does each use?
crs(DTM_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["unknown",
## BASEGEOGCRS["unknown",
## DATUM["Unknown based on Bessel 1841 ellipsoid",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]]],
## CONVERSION["unknown",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
crs(DTM_hill_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80
## +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["unknown",
## BASEGEOGCRS["unknown",
## DATUM["Unknown based on GRS80 ellipsoid",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1],
## ID["EPSG",7019]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]]],
## CONVERSION["unknown",
## METHOD["Lambert Azimuthal Equal Area",
## ID["EPSG",9820]],
## PARAMETER["Latitude of natural origin",52,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",10,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",4321000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",3210000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
We can use the projectRaster() function to reproject a
raster into a new CRS. Keep in mind that reprojection only works when
you first have a defined CRS for the raster object that you want to
reproject. It cannot be used if no CRS is defined.
DTM_hill_EPSG28992_TUD <- projectRaster(DTM_hill_TUD,
crs = crs(DTM_TUD))
crs(DTM_hill_EPSG28992_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["unknown",
## BASEGEOGCRS["unknown",
## DATUM["Unknown based on Bessel 1841 ellipsoid",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]]],
## CONVERSION["unknown",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
crs(DTM_hill_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80
## +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["unknown",
## BASEGEOGCRS["unknown",
## DATUM["Unknown based on GRS80 ellipsoid",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1],
## ID["EPSG",7019]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]]],
## CONVERSION["unknown",
## METHOD["Lambert Azimuthal Equal Area",
## ID["EPSG",9820]],
## PARAMETER["Latitude of natural origin",52,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",10,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",4321000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",3210000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
extent(DTM_hill_EPSG28992_TUD)
## class : Extent
## xmin : 83395.44
## xmax : 87301.54
## ymin : 444886.2
## ymax : 447328
extent(DTM_hill_TUD)
## class : Extent
## xmin : 3933146
## xmax : 3936870
## ymin : 3223825
## ymax : 3225980
Why do you think the two extents differ?
res(DTM_hill_EPSG28992_TUD)
## [1] 5.30 4.66
res(DTM_TUD)
## [1] 5 5
These two resolutions are different, but they’re representing the
same data. We can tell R to force our newly reprojected raster to be the
same as DTM_TUD by using res(DTM_TUD).
DTM_hill_EPSG28992_TUD <- projectRaster(DTM_hill_TUD,
crs = crs(DTM_TUD),
res = res(DTM_TUD))
res(DTM_hill_EPSG28992_TUD)
## [1] 5 5
res(DTM_TUD)
## [1] 5 5
DTM_hill_TUD_2_df <- as.data.frame(DTM_hill_EPSG28992_TUD, xy = TRUE)
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y,
fill = tud.dtm.5m)) +
geom_raster(data = DTM_hill_TUD_2_df,
aes(x = x, y = y,
alpha = tud.dtm.5m.hill.ETRS89)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
We often want to perform calculations on two or more rasters to create a new output raster. For example, if we are interested in mapping the heights of trees and buildings across an entire field site, we might want to calculate the difference between the Digital Surface Model (DSM, tops of trees and buildings) and the Digital Terrain Model (DTM, ground level). The resulting dataset is referred to as a Canopy Height Model (CHM) and represents the actual height of trees, buildings, etc. with the influence of ground elevation removed.
GDALinfo(here("data","tud-dtm-5m.tif"))
## rows 386
## columns 722
## bands 1
## lower left origin.x 83565
## lower left origin.y 445250
## res.x 5
## res.y 5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2023-06/data/tud-dtm-5m.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 FALSE 0 2 722
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -5.894642 16.27887 -0.5170563 1.158072
## Metadata:
## AREA_OR_POINT=Area
GDALinfo(here("data","tud-dsm-5m.tif"))
## rows 386
## columns 722
## bands 1
## lower left origin.x 83565
## lower left origin.y 445250
## res.x 5
## res.y 5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2023-06/data/tud-dsm-5m.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 FALSE 0 2 722
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -5.39069 92.08102 2.588572 5.45896
## Metadata:
## AREA_OR_POINT=Area
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y, fill = tud.dtm.5m)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
ggplot() +
geom_raster(data = DSM_TUD_df ,
aes(x = x, y = y, fill = tud.dsm.5m)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
CHM_TUD <- DSM_TUD - DTM_TUD
CHM_TUD_df <- as.data.frame(CHM_TUD, xy = TRUE)
ggplot() +
geom_raster(data = CHM_TUD_df ,
aes(x = x, y = y, fill = layer)) +
scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
coord_quickmap()
ggplot(CHM_TUD_df) +
geom_histogram(aes(layer))
It’s often a good idea to explore the range of values in a raster dataset just like we might explore a dataset that we collected in the field.
CHM_TUD that we just created?CHM_TUD?min(CHM_TUD_df$layer, na.rm = TRUE)
## [1] -3.638057
max(CHM_TUD_df$layer, na.rm = TRUE)
## [1] 92.08102
ggplot(CHM_TUD_df) +
geom_histogram(aes(layer))
ggplot(CHM_TUD_df) +
geom_histogram(aes(layer), colour="black",
fill="darkgreen", bins = 6)
custom_bins <- c(0, 10, 20, 30, 100)
CHM_TUD_df <- CHM_TUD_df %>%
mutate(canopy_discrete = cut(layer, breaks = custom_bins))
ggplot() +
geom_raster(data = CHM_TUD_df , aes(x = x, y = y,
fill = canopy_discrete)) +
scale_fill_manual(values = terrain.colors(4)) +
coord_quickmap()
writeRaster(CHM_TUD, here("fig_output","CHM_TUD.tiff"),
format="GTiff",
overwrite=TRUE,
NAflag=-9999)
In this episode, the multi-band data that we are working with is high resolution imagery for the Netherlands.
The raster() function only reads in the first band, in
this case the red band of an RGB raster.
RGB_band1_TUD <- raster(here("data","tudlib-rgb.tif"))
RGB_band1_TUD_df <- as.data.frame(RGB_band1_TUD, xy = TRUE)
ggplot() +
geom_raster(data = RGB_band1_TUD_df,
aes(x = x, y = y, alpha = tudlib.rgb_1)) +
coord_quickmap() # use `coord_equal()` instead
RGB_band1_TUD
## class : RasterLayer
## band : 1 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb_1
nbands(RGB_band1_TUD)
## [1] 3
To import the green band:
RGB_band2_TUD <- raster(here("data","tudlib-rgb.tif"), band = 2)
RGB_band2_TUD_df <- as.data.frame(RGB_band2_TUD, xy = TRUE)
ggplot() +
geom_raster(data = RGB_band2_TUD_df,
aes(x = x, y = y, alpha = tudlib.rgb_1)) +
coord_equal()
There is a better way of reading in all bands. The
stack() function brings in all bands
RGB_stack_TUD <- stack(here("data","tudlib-rgb.tif"))
RGB_stack_TUD
## class : RasterStack
## dimensions : 4988, 4866, 24271608, 3 (nrow, ncol, ncell, nlayers)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## names : tudlib.rgb_1, tudlib.rgb_2, tudlib.rgb_3
RGB_stack_TUD@layers
## [[1]]
## class : RasterLayer
## band : 1 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb_1
##
##
## [[2]]
## class : RasterLayer
## band : 2 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb_2
##
##
## [[3]]
## class : RasterLayer
## band : 3 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb_3
RGB_stack_TUD[[2]]
## class : RasterLayer
## band : 2 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb_2
RGB_stack_TUD_df <- as.data.frame(RGB_stack_TUD, xy = TRUE)
str(RGB_stack_TUD_df)
## 'data.frame': 24271608 obs. of 5 variables:
## $ x : num 85272 85272 85272 85272 85272 ...
## $ y : num 446694 446694 446694 446694 446694 ...
## $ tudlib.rgb_1: num 52 48 47 49 47 45 47 48 49 54 ...
## $ tudlib.rgb_2: num 64 58 57 60 57 55 58 59 62 69 ...
## $ tudlib.rgb_3: num 57 49 49 53 50 47 51 54 57 68 ...
ggplot() +
geom_histogram(data = RGB_stack_TUD_df, aes(tudlib.rgb_1))
ggplot() +
geom_raster(data = RGB_stack_TUD_df,
aes(x = x, y = y, alpha = tudlib.rgb_2)) +
coord_equal()
To create an RGB image, we will use the plotRGB function
from the raster package.
plotRGB(RGB_stack_TUD,
r = 1, g = 2, b = 3)
The image above looks pretty good. We can explore whether applying a
stretch to the image might improve clarity and contrast using
stretch="lin" or stretch="hist".
plotRGB(RGB_stack_TUD,
r = 1, g = 2, b = 3,
scale = 800,
stretch = "lin")
plotRGB(RGB_stack_TUD,
r = 1, g = 2, b = 3,
scale = 800,
stretch = "hist")
The R RasterStack and RasterBrick object types can both store multiple bands. However, how they store each band is different. The bands in a RasterStack are stored as links to raster data that is located somewhere on our computer. A RasterBrick contains all of the objects stored within the actual R object. In most cases, we can work with a RasterBrick in the same way we might work with a RasterStack. However a RasterBrick is often more efficient and faster to process - which is important when working with larger files.
object.size(RGB_stack_TUD)
## 56712 bytes
RGB_brick_TUD <- brick(RGB_stack_TUD)
object.size(RGB_brick_TUD)
## 17128 bytes
plotRGB(RGB_brick_TUD)